Goto

Collaborating Authors

 bayesian additive regression tree


MLCBART: Multilabel Classification with Bayesian Additive Regression Trees

arXiv.org Machine Learning

Multilabel Classification (MLC) deals with the simultaneous classification of multiple binary labels. The task is challenging because, not only may there be arbitrarily different and complex relationships between predictor variables and each label, but associations among labels may exist even after accounting for effects of predictor variables. In this paper, we present a Bayesian additive regression tree (BART) framework to model the problem. BART is a nonparametric and flexible model structure capable of uncovering complex relationships within the data. Our adaptation, MLCBART, assumes that labels arise from thresholding an underlying numeric scale, where a multivariate normal model allows explicit estimation of the correlation structure among labels. This enables the discovery of complicated relationships in various forms and improves MLC predictive performance. Our Bayesian framework not only enables uncertainty quantification for each predicted label, but our MCMC draws produce an estimated conditional probability distribution of label combinations for any predictor values. Simulation experiments demonstrate the effectiveness of the proposed model by comparing its performance with a set of models, including the oracle model with the correct functional form. Results show that our model predicts vectors of labels more accurately than other contenders and its performance is close to the oracle model. An example highlights how the method's ability to produce measures of uncertainty on predictions provides nuanced understanding of classification results.


An Infinite BART model

arXiv.org Machine Learning

Bayesian additive regression trees (BART) are popular Bayesian ensemble models used in regression and classification analysis. Under this modeling framework, the regression function is approximated by an ensemble of decision trees, interpreted as weak learners that capture different features of the data. In this work, we propose a generalization of the BART model that has two main features: first, it automatically selects the number of decision trees using the given data; second, the model allows clusters of observations to have different regression functions since each data point can only use a selection of weak learners, instead of all of them. This model generalization is accomplished by including a binary weight matrix in the conditional distribution of the response variable, which activates only a specific subset of decision trees for each observation. Such a matrix is endowed with an Indian Buffet process prior, and sampled within the MCMC sampler, together with the other BART parameters. We then compare the Infinite BART model with the classic one on simulated and real datasets. Specifically, we provide examples illustrating variable importance, partial dependence and causal estimation.


Bayesian Additive Regression Trees for functional ANOVA model

arXiv.org Machine Learning

Bayesian Additive Regression Trees (BART) is a powerful statistical model that leverages the strengths of Bayesian inference and regression trees. It has received significant attention for capturing complex non-linear relationships and interactions among predictors. However, the accuracy of BART often comes at the cost of interpretability. To address this limitation, we propose ANOVA Bayesian Additive Regression Trees (ANOVA-BART), a novel extension of BART based on the functional ANOVA decomposition, which is used to decompose the variability of a function into different interactions, each representing the contribution of a different set of covariates or factors. Our proposed ANOVA-BART enhances interpretability, preserves and extends the theoretical guarantees of BART, and achieves superior predictive performance. Specifically, we establish that the posterior concentration rate of ANOVA-BART is nearly minimax optimal, and further provides the same convergence rates for each interaction that are not available for BART. Moreover, comprehensive experiments confirm that ANOVA-BART surpasses BART in both accuracy and uncertainty quantification, while also demonstrating its effectiveness in component selection. These results suggest that ANOVA-BART offers a compelling alternative to BART by balancing predictive accuracy, interpretability, and theoretical consistency.


Probit Monotone BART

arXiv.org Machine Learning

Bayesian Additive Regression Trees (BART) of Chipman et al. (2010) has proven to be a powerful tool for nonparametric modeling and prediction. Monotone BART (Chipman et al., 2022) is a recent development that allows BART to be more precise in estimating monotonic functions. We further these developments by proposing probit monotone BART, which allows the monotone BART framework to estimate conditional mean functions when the outcome variable is binary.


Learning Conditional Average Treatment Effects in Regression Discontinuity Designs using Bayesian Additive Regression Trees

arXiv.org Machine Learning

Such designs arise when treatment assignment is based on whether a particular covariate -- referred to as the running variable -- lies above or below a known value, referred to as the cutoff value. Because treatment is deterministically assigned as a known function of the running variable, RDDs are trivially deconfounded: treatment assignment is independent of the outcome variable, given the running variable (because treatment is conditionally constant). However, estimation of treatment effects in RDDs is more complicated than simply controlling for the running variable, because doing so introduces a complete lack of overlap, which is the other key condition needed to justify regression adjustment for causal inference. Nonetheless, treatment effects at the cutoff may still be identified. Specifically, it is well-known that treatment effects at the cutoff can be estimated from RDDs as the magnitude of a discontinuity in the conditional mean response function at that point (Hahn et al., 2001). This paper investigates the use of Bayesian additive regression tree models (Chipman et al., 2010; Hahn et al., 2020) for the purpose of estimating conditional average treatments effects (CATE) at the cutoff, conditional on observed covariates other than the running variable. To the best of our knowledge, such data-driven CATE estimation has not been a focus of the existing RDD literature and we are the first to propose BART for this purpose.


Functional Bayesian Additive Regression Trees with Shape Constraints

arXiv.org Machine Learning

Motivated by the great success of Bayesian additive regression trees (BART) on regression, we propose a nonparametric Bayesian approach for the function-on-scalar regression problem, termed as Functional BART (FBART). Utilizing spline-based function representation and tree-based domain partition model, FBART offers great flexibility in characterizing the complex and heterogeneous relationship between the response curve and scalar covariates. We devise a tailored Bayesian backfitting algorithm for estimating the parameters in the FBART model. Furthermore, we introduce an FBART model with shape constraints on the response curve, enhancing estimation and prediction performance when prior shape information of response curves is available. By incorporating a shape-constrained prior, we ensure that the posterior samples of the response curve satisfy the required shape constraints (e.g., monotonicity and/or convexity). Our proposed FBART model and its shape-constrained version are the new advances of BART models for functional data. Under certain regularity conditions, we derive the posterior convergence results for both FBART and its shape-constrained version. Finally, the superiority of the proposed methods over other competitive counterparts is validated through simulation experiments under various settings and analyses of two real datasets.


Very fast Bayesian Additive Regression Trees on GPU

arXiv.org Machine Learning

BART Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression method, introduced by Chipman, George, and McCulloch (2006, 2010). It defines a prior distribution over the space of functions by representing them as a sum of binary decision trees, and then specifying a stochastic tree generation process. The posterior is then obtained with Metropolis-Gibbs sampling over the trees. See Hill, Linero, and Murray (2020) for a review, and Daniels, Linero, and Roy (2023, ch. 5) for a textbook treatment. BART's success BART has proven empirically effective, and is gaining popularity (consider, e.g., Tan and Roy 2019). The Atlantic Causal Inference Conference (ACIC) Data Challenge has confirmed BART as one of the best regression methods for causal inference (Dorie et al. 2019; Gruber et al. 2019; Hahn, Dorie, and Murray 2019; Thal and Finucane 2023). Many BART variants have been developed throughout the years, adding features such as variable selection (Linero 2018).


On the Gaussian process limit of Bayesian Additive Regression Trees

arXiv.org Machine Learning

Bayesian Additive Regression Trees (BART) is a nonparametric Bayesian regression technique of rising fame. It is a sum-of-decision-trees model, and is in some sense the Bayesian version of boosting. In the limit of infinite trees, it becomes equivalent to Gaussian process (GP) regression. This limit is known but has not yet led to any useful analysis or application. For the first time, I derive and compute the exact BART prior covariance function. With it I implement the infinite trees limit of BART as GP regression. Through empirical tests, I show that this limit is worse than standard BART in a fixed configuration, but also that tuning the hyperparameters in the natural GP way yields a competitive method, although a properly tuned BART is still superior. The advantage of using a GP surrogate of BART is the analytical likelihood, which simplifies model building and sidesteps the complex BART MCMC. More generally, this study opens new ways to understand and develop BART and GP regression. The implementation of BART as GP is available in the Python package https://github.com/Gattocrucco/lsqfitgp .


Bayesian Causal Forests for Longitudinal Data: Assessing the Impact of Part-Time Work on Growth in High School Mathematics Achievement

arXiv.org Machine Learning

Modelling growth in student achievement is a significant challenge in the field of education. Understanding how interventions or experiences such as part-time work can influence this growth is also important. Traditional methods like difference-in-differences are effective for estimating causal effects from longitudinal data. Meanwhile, Bayesian non-parametric methods have recently become popular for estimating causal effects from single time point observational studies. However, there remains a scarcity of methods capable of combining the strengths of these two approaches to flexibly estimate heterogeneous causal effects from longitudinal data. Motivated by two waves of data from the High School Longitudinal Study, the NCES' most recent longitudinal study which tracks a representative sample of over 20,000 students in the US, our study introduces a longitudinal extension of Bayesian Causal Forests. This model allows for the flexible identification of both individual growth in mathematical ability and the effects of participation in part-time work. Simulation studies demonstrate the predictive performance and reliable uncertainty quantification of the proposed model. Results reveal the negative impact of part time work for most students, but hint at potential benefits for those students with an initially low sense of school belonging. Clear signs of a widening achievement gap between students with high and low academic achievement are also identified. Potential policy implications are discussed, along with promising areas for future research.


Ensembles of Probabilistic Regression Trees

arXiv.org Machine Learning

Tree-based ensemble methods such as random forests, gradient-boosted trees, and Bayesianadditive regression trees have been successfully used for regression problems in many applicationsand research studies. In this paper, we study ensemble versions of probabilisticregression trees that provide smooth approximations of the objective function by assigningeach observation to each region with respect to a probability distribution. We prove thatthe ensemble versions of probabilistic regression trees considered are consistent, and experimentallystudy their bias-variance trade-off and compare them with the state-of-the-art interms of performance prediction.